Fault-Tolerant Parallel Programming with Atomic Actions
Abstract
The Pact (parallel actions) parallel programming environment provides an easy-to-use parallel execution and synchronization model based on task parallelization. To give the programmer an abstraction for global data, even on distributed memory machines, the Pact runtime system uses virtual shared memory. Execution efficiency is improved by data-dependent dynamic load balancing and by latency masking with multithreaded servers. Fault tolerance in Pact is based on atomic actions and is guaranteed by the runtime system in a fully user-transparent way. This article describes the design of the Pact runtime system together with its logging and recovery algorithms for an implementation on a massively parallel distributed memory computer.
Similar resources
Pact – A Fault Tolerant Parallel Programming Environment
Pact is a parallel programming environment that relieves the programmer of those burdens of parallel programming that are not really necessary for writing efficient parallel programs. This is done by providing a simple synchronization model and virtual shared data with user-defined granularity and automatic consistency control. Pact guarantees user-transparent fault-tolerance with low overhead by usi...
CSP Methods for Identifying Atomic Actions in the Design of Fault Tolerant Concurrent Systems
Limiting the extent of error propagation when faults occur and localizing the subsequent error recovery are common concerns in the design of fault tolerant parallel processing systems. Both activities are made easier if the designer associates fault tolerance mechanisms with the underlying atomic actions of the system. With this in mind, this paper has investigated two methods for the identific...
Stabilis: A Case Study in Writing Fault-Tolerant Distributed Applications Using Persistent Objects
This paper presents Stabilis, a fault-tolerant object-oriented distributed database management system that has been written as an exercise in persistent programming. Stabilis is implemented on top of Arjuna, an object-oriented programming system that provides the basic mechanisms for fault tolerance and distribution. The computational model used by Arjuna is based upon the concept of using ato...
Implementing Atomic Actions in Ada 95
Atomic actions are an important dynamic structuring technique that aids the construction of fault-tolerant concurrent systems. Although they were developed some years ago, none of the well-known commercially-available programming languages directly support their use. This paper summarizes software fault tolerance techniques for concurrent systems, evaluates the Ada 95 programming language from t...